One variable split into multiple columns can be solved with pivot_longer
step1 <- pivot_longer(
  cities_untidy,                        # the tibble
  cols = Turkey_Istanbul:France_Paris,  # the columns to pivot, from:to
  names_to = "location",                # name of the new column
  values_to = "value")                  # name of the value column
Another way to select the columns to pivot:
step1 <- pivot_longer(
  cities_untidy,          # the tibble
  cols = !type,           # all columns except type
  names_to = "location",  # name of the new column
  values_to = "value")    # name of the value column
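As a rough check that the two ways of selecting columns are interchangeable, the sketch below runs both on a small stand-in table. `cities_small` is a hypothetical two-city excerpt, not the real `cities_untidy`; its values are taken from the outputs shown in these slides.

```r
library(tidyr)
library(tibble)

# Hypothetical two-city excerpt of cities_untidy (assumed layout)
cities_small <- tibble(
  type            = c("population_size", "city_area"),
  Turkey_Istanbul = c(15100000, 2576),
  Russia_Moscow   = c(12500000, 2561)
)

# Select the columns to pivot as a from:to range
by_range <- pivot_longer(
  cities_small,
  cols = Turkey_Istanbul:Russia_Moscow,
  names_to = "location",
  values_to = "value")

# Select the columns to pivot by excluding type
by_exclusion <- pivot_longer(
  cities_small,
  cols = !type,
  names_to = "location",
  values_to = "value")

# Compare the two results
all.equal(by_range, by_exclusion)
```

Excluding `type` tends to be more robust when city columns are added or reordered, since the range form depends on column order.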
separate_wider_delim()
Multiple variable values that are united into one column can be separated using separate_wider_delim
#> # A tibble: 20 × 3
#> type location value
#> <chr> <chr> <dbl>
#> 1 population_size Turkey_Istanbul 15100000
#> 2 population_size Russia_Moscow 12500000
#> # ℹ 18 more rows
step2 <- separate_wider_delim(
  step1,                              # the tibble
  location,                           # the column to separate
  delim = "_",                        # the separator
  names = c("country", "city_name"))  # names of new columns
#> # A tibble: 20 × 4
#> type country city_name value
#> <chr> <chr> <chr> <dbl>
#> 1 population_size Turkey Istanbul 15100000
#> 2 population_size Russia Moscow 12500000
#> # ℹ 18 more rows
The opposite function exists as well and is called unite. Check out ?unite for details.
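A minimal sketch of unite, reversing the separation step. `step2_small` is a hypothetical two-row stand-in for step2, with values copied from the output above.

```r
library(tidyr)
library(tibble)

# Hypothetical two-row stand-in for step2
step2_small <- tibble(
  type      = c("population_size", "population_size"),
  country   = c("Turkey", "Russia"),
  city_name = c("Istanbul", "Moscow"),
  value     = c(15100000, 12500000)
)

# Paste country and city_name back into one column, separated by "_"
reunited <- unite(step2_small, col = "location", country, city_name, sep = "_")
reunited$location
```

By default unite removes the input columns; pass `remove = FALSE` to keep them next to the new column.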
pivot_wider()
One observation split into multiple rows can be solved with pivot_wider
#> # A tibble: 20 × 4
#> type country city_name value
#> <chr> <chr> <chr> <dbl>
#> 1 population_size Turkey Istanbul 15100000
#> 2 population_size Russia Moscow 12500000
#> # ℹ 18 more rows
step3 <- pivot_wider(
  step2,                # the tibble
  names_from = type,    # the variables
  values_from = value)  # the values
#> # A tibble: 10 × 4
#> country city_name population_size city_area
#> <chr> <chr> <dbl> <dbl>
#> 1 Turkey Istanbul 15100000 2576
#> 2 Russia Moscow 12500000 2561
#> 3 UK London 9000000 1572
#> 4 Russia Saint Petersburg 5400000 1439
#> 5 Germany Berlin 3800000 891
#> # ℹ 5 more rows
All steps in 1
We can also use a pipe to do all these steps in one:
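The three steps above could be chained roughly as follows. This is a sketch, not necessarily the original slide code: it assumes the native pipe `|>` (R 4.1 or later) and tidyr 1.3.0 or later for separate_wider_delim, and it runs on `cities_small`, a hypothetical two-city excerpt of cities_untidy built from the values shown above.

```r
library(tidyr)
library(tibble)

# Hypothetical two-city excerpt of cities_untidy (assumed layout)
cities_small <- tibble(
  type            = c("population_size", "city_area"),
  Turkey_Istanbul = c(15100000, 2576),
  Russia_Moscow   = c(12500000, 2561)
)

cities_tidy <- cities_small |>
  # step 1: one variable split into multiple columns
  pivot_longer(cols = !type,
               names_to = "location",
               values_to = "value") |>
  # step 2: two values united in one column
  separate_wider_delim(location,
                       delim = "_",
                       names = c("country", "city_name")) |>
  # step 3: one observation split into multiple rows
  pivot_wider(names_from = type,
              values_from = value)

cities_tidy
```

Each function takes the tibble as its first argument, so the pipe passes the result of one step straight into the next and the intermediate objects step1, step2, and step3 are no longer needed.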
drop_na()
# drop rows with missing values in any column
drop_na(and_vertebrates)
# drop rows with missing values in weight column
drop_na(and_vertebrates, weight_g)
# drop rows with missing values in weight and species columns
drop_na(and_vertebrates, weight_g, species)
This is an easier and more intuitive alternative to filter(!is.na(...)).
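To make the comparison concrete, here is a small sketch that applies both approaches to the same data. `verts` is a made-up stand-in for and_vertebrates with only two columns, and filter comes from dplyr.

```r
library(tidyr)
library(dplyr)
library(tibble)

# Made-up stand-in for and_vertebrates (illustrative values only)
verts <- tibble(
  species  = c("trout", NA, "salamander"),
  weight_g = c(1.75, 2.0, NA)
)

# Drop rows with a missing weight or species
with_drop_na <- drop_na(verts, weight_g, species)

# The same selection written with filter() and is.na()
with_filter <- filter(verts, !is.na(weight_g), !is.na(species))

# Compare the two results
all.equal(with_drop_na, with_filter)
```

With drop_na the column names are simply listed, while the filter version needs one `!is.na()` condition per column.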